Abstract — Residual vector quantization (RVQ), or multistage VQ, as it is also 
called, has recently been shown to be a competitive technique for data com- 
pression [1]. The competitive performance of RVQ reported in [1] results from 
the joint optimization of variable rate encoding and RVQ direct-sum codebooks. 
In this paper, necessary conditions for the optimality of variable rate RVQs 
are derived, and an iterative descent algorithm based on a Lagrangian formu- 
lation is introduced for designing RVQs having minimum average distortion 
subject to an entropy constraint. Simulation results for these entropy-constrained 
RVQs (EC-RVQs) are presented for memoryless Gaussian, Laplacian, and uni- 
form sources. A Gauss-Markov source is also considered. The performance is 
superior to that of entropy-constrained scalar quantizers (EC-SQs) and prac- 
tical entropy-constrained vector quantizers (EC-VQs), and is competitive with 
that of some of the best source coding techniques that have appeared in the 
literature. 

Index Terms — Residual vector quantization, multistage vector quantization, en- 
tropy, source coding. 
1 Introduction 


Residual Vector Quantization (RVQ), or multistage VQ, as it is also called, was
originally introduced in 1982 [2]. Its structure, which is shown in Figure 1,
consists of a cascade of VQ stages (hence the name multistage VQ). For the pth
stage VQ, the input vector $x_p$ is quantized, resulting in the approximation
$\hat{x}_p$. The difference is then computed to form the residual
$x_{p+1} = x_p - \hat{x}_p$, which serves as the input to the next stage. This
aspect of the structure motivates the name residual VQ or RVQ.

Perhaps the most striking benefit of RVQ is its memory efficient structure. An
RVQ with P stages and $N_p$ vectors per stage ($1 \le p \le P$) can uniquely
represent $\prod_{p=1}^{P} N_p$ vectors with only $\sum_{p=1}^{P} N_p$ vectors
needed for storage. Furthermore, similar savings in computation may be achieved
by exploiting the RVQ tree structure.
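
The storage argument above is easy to check numerically. The following is a minimal sketch; the function name `rvq_sizes` and the 8-stage configuration are illustrative assumptions, not taken from the paper.

```python
from math import prod

def rvq_sizes(stage_sizes):
    """Return (representable direct-sum vectors, stored stage vectors)."""
    return prod(stage_sizes), sum(stage_sizes)

# Hypothetical configuration: P = 8 stages with N_p = 4 code vectors each.
representable, stored = rvq_sizes([4] * 8)
print(representable, stored)  # 65536 32
```

With these assumed sizes, 32 stored vectors address 65536 distinct direct-sum code vectors.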

Despite these attractive features, RVQ has received little attention until recently. 
Early assessments of its utility, as reported by Baker [3] and in a survey paper by 
Makhoul, et al. [4], were somewhat discouraging. In the former case, some preliminary 
investigations with RVQ structures having more than two stages (applied to image 
coding) led to the conclusion that it is not advantageous to iteratively vector quantize 
image waveform residuals [3, p. 102]. In the latter case, Makhoul, et al. observed a 
rapid degradation in performance for RVQ applied to speech coding as the number 
of stages was increased and suggested that RVQ be limited to not more than two or 
three stages. 

In 1989, Barnes [5] introduced an analysis of RVQ in which the RVQ is optimized 
subject to the imposed structural constraint. The new design method led to an 
improvement in performance over previous design methods. Since then the technical 
literature has shown much activity in the area of RVQ and the application of RVQ 
to data compression has become more widespread [6, 7, 8, 9, 10, 11, 12, 13]. 
In this paper, we extend the theory and design methods of fixed rate RVQs to the 
case of variable rate RVQ. The first part of the paper (Section 2) follows the work 
presented in [14, 15] where necessary conditions for the optimality of fixed rate RVQ 
are derived. Here, however, we present a mathematical treatment of convergence for 
the RVQ design algorithm. The next part of the paper gives a derivation of optimality 
conditions for variable rate RVQ. It is well known that variable rate systems can yield 
a lower average rate than fixed rate systems. This property has been demonstrated 
in [16, 17] for entropy-constrained VQ (EC-VQ). EC-VQ has shown some of the best 
performance results among entropy coded quantization schemes. In our discussions, 
a theory for entropy constrained RVQ (EC-RVQ) is developed. In addition, a locally 
optimal design algorithm is introduced and convergence issues are addressed. The 
paper concludes with an evaluation and comparison of the performance of EC-RVQ 
on some well-known synthetic sources. Simulation results show that EC-RVQ achieves 
some of the best performance results reported to date. 

2 Fixed Rate RVQ 

The first approach introduced for the design of RVQs consists of using the LBG 
algorithm sequentially on each stage [2]. Although each of the stage codebooks is 
designed to minimize the average distortion introduced by that stage (given fixed 
prior stages), there is no guarantee that the overall average distortion introduced 
by the RVQ is minimized. A better design technique is one that designs the stage 
codebooks jointly to minimize the overall average distortion. The key to optimizing 
the RVQ stages jointly is to view the RVQ in terms of a structurally constrained 
direct-sum codebook (that is, a codebook that contains all possible ordered direct- 
sums of stage code vectors) and find necessary conditions for the optimality of that 
direct-sum codebook (i.e., joint optimality of all stage codebooks). 

A direct-sum codebook may be depicted in several ways. Here we choose to view
it diagrammatically as a tree. To illustrate this, consider a three-stage RVQ with
two vectors in each stage codebook: stage 1 contains vectors $y_1(1)$, $y_1(2)$;
stage 2 contains vectors $y_2(1)$, $y_2(2)$; and stage 3 contains $y_3(1)$,
$y_3(2)$. Figure 2 shows a tree
corresponding to this RVQ where the stages are delineated by the dashed lines and 
the stage code vectors appear inside the nodes. Eight nodes appear at the base of 
the tree, each one corresponding to a direct-sum code vector. The value of any one 
of the eight code vectors is obtained by tracing the unique path from bottom to top 
and summing the stage code vectors (shown inside the nodes) along the way. This 
simple tree interpretation is helpful for suggesting efficient RVQ encoder structures, 
and for understanding both the optimality conditions and the corresponding RVQ 
design algorithms. 
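
The three-stage example can be made concrete with a short sketch that enumerates the direct-sum codebook. The numerical stage code vectors below are made-up values chosen for illustration, and `direct_sum_codebook` is a hypothetical helper name.

```python
from itertools import product

def direct_sum_codebook(stages):
    """Map each P-tuple index j (1-based, as in the text) to y(j) = sum_p y_p(j_p)."""
    index_sets = [range(1, len(stage) + 1) for stage in stages]
    codebook = {}
    for j in product(*index_sets):
        vectors = [stage[jp - 1] for stage, jp in zip(stages, j)]
        # Sum the selected stage code vectors component by component.
        codebook[j] = tuple(sum(c) for c in zip(*vectors))
    return codebook

# Illustrative 2-D stage code vectors (assumed values).
stages = [
    [(4.0, 0.0), (-4.0, 0.0)],   # y_1(1), y_1(2)
    [(0.0, 2.0), (0.0, -2.0)],   # y_2(1), y_2(2)
    [(1.0, 1.0), (-1.0, -1.0)],  # y_3(1), y_3(2)
]
codebook = direct_sum_codebook(stages)
```

Each key of `codebook` corresponds to one root-to-leaf path of the tree, so six stored vectors yield eight direct-sum code vectors.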

Equally important to the discussion is the mathematical notation used to describe
inputs, outputs, and the various components of the RVQ. Let $x_1$ be a realization
of the random k-dimensional vector $X_1$ described by the probability density
function (pdf) $f_{X_1}(x_1)$ on $\mathbb{R}^k$. A P-stage RVQ (see Figure 1)
consists of a finite sequence of P vector quantizers. For the pth stage VQ, where
$1 \le p \le P$, let us define the following symbols:

$N_p$              the pth stage codebook size
$j_p$              the pth stage index ($1 \le j_p \le N_p$)
$J_p$              the set of all possible values for $j_p$, i.e., $\{1, 2, \ldots, N_p\}$
$y_p(j_p)$         the $j_p$th code vector of the pth stage
$S_p(j_p)$         the $j_p$th partition cell of the pth stage
$V_p(j_p)$         the $j_p$th stage-removed residual equivalence class of the pth stage
$C_p$              the pth stage codebook $\{y_p(j_p) : j_p \in J_p\}$
$\mathcal{P}_p$    the pth stage partition $\{S_p(j_p) : j_p \in J_p\}$
$Q_p$              the pth stage quantizer mapping
The pth stage VQ quantizes the residual vector $x_p$ and outputs $Q_p(x_p)$. The
pth stage quantizer mapping $Q_p : \mathbb{R}^k \to C_p$ can be realized by a
composition of a fixed length encoder mapping $E_p : \mathbb{R}^k \to J_p$, where

$$E_p(x_p) = j_p \ \text{if and only if}\ x_p \in S_p(j_p),$$

and a fixed length decoder mapping $D_p : J_p \to C_p$, where

$$D_p(j_p) = y_p(j_p).$$

As stated in the previous section, a P-stage RVQ can be represented by a tree
as illustrated in Figure 2. The associated "single-stage" direct-sum VQ codebook
and the tree-structured RVQ codebook are identical in the sense that they produce
the same representation of the source output, and thus have the same expected
distortion. For the direct-sum VQ, let us define the following symbols:

$N$              direct-sum codebook size ($N = \prod_{p=1}^{P} N_p$)
$J$              direct-sum P-tuple index set, $J = J_1 \times J_2 \times \cdots \times J_P$
$j$              a P-tuple index in $J$
$y(j)$           the jth direct-sum code vector
$V(j)$           the jth direct-sum partition cell
$C$              direct-sum codebook $\{y(j) : j \in J\}$
$\mathcal{P}$    direct-sum partition $\{V(j) : j \in J\}$
$Q$              direct-sum mapping

The direct-sum codebook contains all possible ordered sums of the stage code vectors,
i.e., $C = C_1 + C_2 + \cdots + C_P$. The direct-sum code vectors are given by

$$y(j) = \sum_{p=1}^{P} y_p(j_p),$$

where $j_p$ is the pth member of the ordered P-tuple index j. The direct-sum VQ
quantizes the source vector $x_1$ and outputs the representation
$\hat{x}_1 = Q(x_1)$ given by

$$Q(x_1) = \sum_{p=1}^{P} Q_p(x_p),$$

where

$$x_p = x_1 - \sum_{i=1}^{p-1} Q_i(x_i), \quad p > 1,$$

is the pth stage causal residual. The term causal refers to the sequential process
used to compute the residual, i.e., the stage residuals are computed sequentially
starting from the first stage to the pth stage.
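
As a minimal sketch of the causal residual recursion, the code below quantizes an input stage by stage and accumulates the reconstruction as the sum of the stage outputs. It assumes squared error as the distortion and nearest-neighbor stage encoders; the function names and the stage code vectors are illustrative assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def stage_sequential_encode(x1, stages):
    """Quantize x1 stage by stage; return (index tuple j, reconstruction, residuals)."""
    residual = tuple(x1)                       # x_1 is the input to stage 1
    recon = tuple(0.0 for _ in x1)
    indices, residuals = [], []
    for stage in stages:
        residuals.append(residual)             # x_p, the causal residual entering stage p
        jp = min(range(len(stage)), key=lambda i: sq_dist(residual, stage[i]))
        q = stage[jp]                          # Q_p(x_p)
        recon = tuple(r + qi for r, qi in zip(recon, q))
        residual = tuple(r - qi for r, qi in zip(residual, q))  # x_{p+1} = x_p - Q_p(x_p)
        indices.append(jp + 1)                 # 1-based index, as in the text
    return tuple(indices), recon, residuals

# Assumed 2-D stage codebooks, two vectors per stage.
stages = [
    [(4.0, 0.0), (-4.0, 0.0)],
    [(0.0, 2.0), (0.0, -2.0)],
    [(1.0, 1.0), (-1.0, -1.0)],
]
j, x_hat, residuals = stage_sequential_encode((4.6, 2.9), stages)
```

Here `residuals[p - 1]` holds $x_p$, and `x_hat` equals the direct-sum code vector $y(j)$ for the returned index.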

2.1 Necessary Conditions for Optimal Fixed Rate RVQ 

Let the distortion that results from representing x with y be expressed by d(x,y).
The distortion measure d(x,y) is assumed to be a non-negative real-valued function
that satisfies the following requirements:

1. For any fixed $x \in \mathbb{R}^k$, d(x,y) is a continuously differentiable
   function of $y \in \mathbb{R}^k$.

2. d(x,y) is translation invariant.

3. For any fixed $x \in \mathbb{R}^k$, d(x,y) is a strictly convex function of y,
   that is, for all $y_1, y_2 \in \mathbb{R}^k$ with $y_1 \ne y_2$ and
   $\lambda \in (0,1)$,
   $d(x, \lambda y_1 + (1-\lambda) y_2) < \lambda d(x,y_1) + (1-\lambda) d(x,y_2)$.

A P-stage RVQ is said to be optimal if it gives at least a locally minimum value of
the average distortion incurred in representing $x_1$ with $\hat{x}_1$,

$$D(x_1, \hat{x}_1) = E\left\{ d\left[ x_1, \sum_{p=1}^{P} Q_p(x_p) \right] \right\}. \tag{1}$$

For stage codebook and partition optimality, (1) should be minimized with respect to
stage codebook and partition parameters. However, this minimization is complicated
by the fact that knowledge of the joint pdf $f_{X_1,\ldots,X_P}(x_1,\ldots,x_P)$
is required, which, in turn, depends in a complicated fashion upon the sequence of
stage codebooks and partitions. This optimization problem can be made tractable by
viewing the RVQ product code as a single-stage VQ with a structurally constrained
direct-sum codebook (i.e., the direct-sum code vectors are structurally dependent).
By minimizing the average distortion of the direct-sum quantizer,

$$D(x_1, \hat{x}_1) = E\{ d(x_1, Q(x_1)) \},$$

the problem of dealing explicitly with the complicated structural interdependencies
that exist among the stages of the RVQ is avoided.

First, to derive optimality conditions for a fixed rate RVQ direct-sum partition,
assume that the stage codebooks $\{C_1, C_2, \ldots, C_P\}$ are fixed, which implies
that the direct-sum codebook C is also fixed. Then

$$E\{ d[x_1, Q(x_1)] \} \ge E\left\{ \min_{y(j) \in C} d[x_1, y(j)] \right\}.$$

That is, no direct-sum partition can yield lower average distortion than the partition
obtained by the nearest-neighbor mapping. Accordingly, we have the nearest-neighbor
encoding rule,

$$x_1 \in V^*(j) \ \text{iff}\ d[x_1, y(j)] \le d[x_1, y(k)] \ \text{for all}\ k \in J. \tag{2}$$

The optimal direct-sum partition cells are denoted with asterisks, $V^*(j)$.

The next step is to determine necessary conditions for optimal stage code vectors.
For the derivation that follows it is useful to introduce the stage-removed index
mapping $\beta_p : J \to \bar{J}_p$,
$\bar{J}_p = J_1 \times J_2 \times \cdots \times J_{p-1} \times J_{p+1} \times \cdots \times J_P$,
defined by

$$\beta_p(j) = (j_1, j_2, \ldots, j_{p-1}, j_{p+1}, \ldots, j_P)$$

for $j \in J$. Note that $\beta_p(j)$ includes all members of j except the pth
member, hence the name stage-removed index. This index represents a shortened path
through the RVQ tree where the pth level branch has been removed, and the remainder
of the path starting with the (p + 1)th level branch has been added or grafted back
into the tree structure. Hence, each direct-sum code vector y(j), where
$j = (j_1, \ldots, j_{p-1}, j_p, j_{p+1}, \ldots, j_P) \in J$, can be written as

$$y(j) = g(\beta_p(j)) + y_p(j_p),$$

where

$$g(\beta_p(j)) = \sum_{i=1,\, i \ne p}^{P} y_i(j_i)$$

is the pth stage-removed direct-sum path of the RVQ tree.

Given a particular $x_1 \in \mathbb{R}^k$ and a fixed RVQ encoding rule, there
exists a pth stage-removed residual vector defined by

$$\gamma_p = x_1 - g(\beta_p(j)).$$

This residual vector is the difference between the input and the stage-removed
direct-sum vector. Because the stage-removed residual $\gamma_p$ is a translation
(conditioned on the pth stage) of the realization $x_1$ of the random vector $X_1$,
it is also a realization of a random vector $\Gamma_p$ with associated stage-removed
residual probability density function $f_{\Gamma_p}(\gamma_p)$.

In addition, let $H_p(j_p)$ be the set of P-tuple indices corresponding to all
direct-sum code vectors y(j) that contain $y_p(j_p)$ in their construction. In other
words, $H_p(j_p) \subset J$ is the set of all indices j such that $j_p \in J_p$ is
the pth element of j. The set $H_p(j_p)$ can be used to describe the $j_p$th
stage-removed residual equivalence class $V_p(j_p)$ by

$$V_p(j_p) = \bigcup_{j \in H_p(j_p)} \left( V(j) - g(\beta_p(j)) \right), \tag{3}$$

where $V(j) - g(\beta_p(j))$ indicates that all $x_1 \in V(j)$ have been translated
by $g(\beta_p(j))$. If V(j) is assumed to be an optimal partition cell, i.e.,
$V(j) = V^*(j)$, then $V_p(j_p) = V_p^*(j_p)$ is an optimal stage-removed residual
equivalence class.
To determine necessary conditions for optimal stage code vectors, assume that
the stage partitions $\{\mathcal{P}_1, \mathcal{P}_2, \ldots, \mathcal{P}_P\}$ are
fixed, which implies that the direct-sum partition $\mathcal{P}$ is fixed. Now let
$K_p$ be the set of all possible pth stage codebooks $C_p$ with $N_p$ vectors, and
let K be the set of all possible direct-sum codebooks formed from the $K_p$'s with
$1 \le p \le P$. Also let $F : K \to [0, \infty)$ be the function given by

$$F(C) = \sum_{j \in J} E_{X_1}\{ d[x_1, y(j)] \mid x_1 \in V(j) \}\, \mathrm{pr}\{x_1 \in V(j)\}, \tag{4}$$

for $y(j) \in C$ and $C \in K$. To find a minimum for the average distortion (4), it
suffices to find a sequence of codebooks
$(C_1^*, C_2^*, \ldots, C_P^*) \in K_1 \times K_2 \times \cdots \times K_P$
and corresponding direct-sum codebook $C^* \in K$ that minimizes F. Coordinate
descent algorithms can be used to find such a minimum. These algorithms are based
on the following procedure: we hold fixed all stage codebooks, except for the pth
stage codebook, and then we minimize F with respect to $C_p$. This is an iterative
procedure and is performed for each stage (i.e., all values of p) until F(C)
converges to a minimum. There are two common forms of implementation [18]. In the
first, often called the nonlinear Jacobi algorithm, the minimizations with respect
to the different codebooks $\{C_1, C_2, \ldots, C_P\}$ are carried out
simultaneously. Mathematically, the nonlinear Jacobi algorithm is described by

$$C_p(t+1) = \arg\min_{C_p} F(C_1(t), \ldots, C_{p-1}(t), C_p, C_{p+1}(t), \ldots, C_P(t)), \tag{5}$$

for $1 \le p \le P$. In the second approach, often called the nonlinear Gauss-Seidel
algorithm, the minimizations are carried out successively for each codebook and may
be described mathematically by

$$C_p(t+1) = \arg\min_{C_p} F(C_1(t+1), \ldots, C_{p-1}(t+1), C_p, C_{p+1}(t), \ldots, C_P(t)), \tag{6}$$
for $1 \le p \le P$. Let us assume all stage codebooks (except for the pth stage
codebook) are fixed. Also, let us modify (4) by writing

$$F(C) = \sum_{j_p \in J_p} \sum_{\beta_p(j) \in \bar{J}_p} E_{X_1}\{ d[x_1, g(\beta_p(j)) + y_p(j_p)] \mid x_1 \in V(j) \}\, \mathrm{pr}\{x_1 \in V(j)\}.$$

Using the assumption that the distortion measure is translation invariant, and also
using (3) together with the law of total probability, we can rewrite the above
equation as

$$F(C) = \sum_{j_p \in J_p} E_{\Gamma_p | j_p}\{ d[\gamma_p, y_p(j_p)] \mid \gamma_p \in V_p(j_p) \}\, \mathrm{pr}\{\gamma_p \in V_p(j_p)\}$$

$$\ge \sum_{j_p \in J_p} \inf_{u \in \mathbb{R}^k} E_{\Gamma_p | j_p}\{ d(\gamma_p, u) \mid \gamma_p \in V_p(j_p) \}\, \mathrm{pr}\{\gamma_p \in V_p(j_p)\}. \tag{7}$$

In [19], it is shown that provided $\mathrm{pr}\{\gamma_p \in V_p(j_p)\} \ne 0$,
there exist $y_p^*(j_p) \in \mathbb{R}^k$ (which we call stage-removed residual
centroids) for the stage-removed residual equivalence classes $V_p(j_p)$ such that

$$\int d[\gamma_p, y_p^*(j_p)]\, f_{\Gamma_p | j_p}(\gamma_p)\, d\gamma_p = \inf_{u \in \mathbb{R}^k} \int d(\gamma_p, u)\, f_{\Gamma_p | j_p}(\gamma_p)\, d\gamma_p < \infty, \tag{8}$$

and that the set of all solutions $y_p^*(j_p)$ to (8) is convex, closed, and bounded.
Since the distortion measure d(x,y) is assumed to be strictly convex in y, the
solution is unique. In (8) the pdf $f_{\Gamma_p | j_p}(\gamma_p)$ is related to the
source pdf $f_{X_1}(x_1)$ according to

where I[V(j)] is an indicator function for the direct-sum partition cell V(j), that
is, I[V(j)] = 1 if $x_1 \in V(j)$ and I[V(j)] = 0 otherwise. The $y_p^*(j_p)$'s which
satisfy (8) are generalized centroids of stage-removed residual vectors (i.e., residual
vectors formed from the encodings of all prior and subsequent RVQ stages). Hence, the
second condition will be referred to as the stage-removed residual centroid condition.
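
For the squared error measure, the generalized centroid in (8) reduces to the sample mean of the stage-removed residuals when a training set stands in for the pdf. The sketch below performs one Gauss-Seidel pass over the stages under that assumption, with the partition (the index assigned to each training vector) held fixed. The function names, 0-based indices, and the toy data are illustrative assumptions.

```python
def gauss_seidel_sweep(train, indices, stages):
    """One Gauss-Seidel pass of stage-removed residual centroid updates (squared error).

    train:   list of k-dimensional training vectors x_1
    indices: list of 0-based P-tuples j, one per training vector (fixed partition)
    stages:  list of P stage codebooks (lists of lists), updated in place
    """
    k = len(train[0])
    for p, stage in enumerate(stages):
        sums = [[0.0] * k for _ in stage]
        counts = [0] * len(stage)
        for x, j in zip(train, indices):
            # gamma_p = x_1 - g(beta_p(j)): subtract every stage vector except the p-th
            gamma = list(x)
            for i, ji in enumerate(j):
                if i != p:
                    for d in range(k):
                        gamma[d] -= stages[i][ji][d]
            counts[j[p]] += 1
            for d in range(k):
                sums[j[p]][d] += gamma[d]
        for jp, n in enumerate(counts):
            if n > 0:  # code vectors of empty classes are left unchanged
                stage[jp] = [s / n for s in sums[jp]]

def avg_distortion(train, indices, stages):
    """Average squared error between each x_1 and its direct-sum code vector y(j)."""
    total = 0.0
    for x, j in zip(train, indices):
        y = [sum(stages[p][jp][d] for p, jp in enumerate(j)) for d in range(len(x))]
        total += sum((a - b) ** 2 for a, b in zip(x, y))
    return total / len(train)

# Toy scalar example (k = 1) with a fixed assignment of training vectors to paths.
train = [[0.0], [1.0], [4.0], [5.0]]
indices = [(0, 0), (0, 1), (1, 0), (1, 1)]
stages = [[[0.0], [1.0]], [[0.0], [0.5]]]
before = avg_distortion(train, indices, stages)
gauss_seidel_sweep(train, indices, stages)
after = avg_distortion(train, indices, stages)
```

Because stage p is updated using the already updated stages before it, this is the successive (Gauss-Seidel) form rather than the simultaneous (Jacobi) form.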
Convergence of the nonlinear Gauss-Seidel algorithm applied to RVQ can now be
established using a descent approach.

Proposition 1: Suppose F is continuously differentiable and convex on
$K_1 \times K_2 \times \cdots \times K_P$. Furthermore, suppose that for each
$p \in \{1, 2, \ldots, P\}$, F is a strictly convex function of $C_p$ when the other
codebooks are held fixed. Let $\{(C_1(t), \ldots, C_P(t))\}$ with t = 0, 1, 2, ...
be a sequence of stage codebooks generated by the nonlinear Gauss-Seidel algorithm.
Then, every limit point of $\{(C_1(t), \ldots, C_P(t))\}$ minimizes F over
$K_1 \times K_2 \times \cdots \times K_P$.

Details of the convergence proof are given in [20]. The proof is based on a descent
approach. In particular, successive minimizations cannot increase the value of
$F[C_1(t), \ldots, C_P(t)]$. This shows that
$F[C_1(t+1), \ldots, C_P(t+1)] \le F[C_1(t), \ldots, C_P(t)]$
and implies the convergence of $F[C_1(t), \ldots, C_P(t)]$ provided that F is bounded
below. It should be noted that if F is not differentiable, the Gauss-Seidel algorithm
may fail to converge to a minimum.

The proof outlined above does not apply to the Jacobi algorithm. Even though
minimizations with respect to each stage cannot increase the value of F, the fact
that these minimizations are carried out simultaneously allows the possibility that
$F[C_1(t+1), \ldots, C_P(t+1)] > F[C_1(t), \ldots, C_P(t)]$. However, convergence of
the nonlinear Jacobi algorithm can be established under suitable assumptions on the
new codebook selection rule or mapping
$R : K_1 \times K_2 \times \cdots \times K_P \to K_1 \times K_2 \times \cdots \times K_P$,
given by

$$R(C_1, C_2, \ldots, C_P) = (C_1, C_2, \ldots, C_P) - c \nabla F(C_1, C_2, \ldots, C_P), \tag{10}$$

where c is a positive real number and $\nabla F$ denotes the gradient of F [21].

Proposition 2: Let F be a continuously differentiable function, let c be a real
number, and suppose that the mapping $R(C_1, C_2, \ldots, C_P)$ given by (10) is a
contraction mapping with respect to the block-max norm
$\|(C_1, C_2, \ldots, C_P)\| = \max_p \|C_p\|_p / w_p$,
where each $\|\cdot\|_p$ is the Euclidean norm on $K_p$ and each $w_p$ is a positive
real number. Then, there exists a unique vector $(C_1^*, C_2^*, \ldots, C_P^*)$ that
minimizes F over $K_1 \times K_2 \times \cdots \times K_P$. Moreover, the sequence
$\{(C_1(t), \ldots, C_P(t))\}$ generated by either of the two algorithms (described
by (5) and (6)) converges to $(C_1^*, C_2^*, \ldots, C_P^*)$ geometrically. For
proof, see [20].

A common distortion measure is the squared error distortion measure defined by

$$d(x, y) = \|x - y\|^2 = \sum_{i=1}^{k} (x_i - y_i)^2,$$

where $\|\cdot\|$ denotes the Euclidean norm and $x_i$ and $y_i$ are elements of the
vectors x and y, respectively. This distortion measure can be written in the form

$$d(x, y) = \rho(\|x - y\|),$$

where $\rho(a) = a^2$. Obviously, $\rho$ is a continuously differentiable and strictly
convex function on $[0, \infty)$ with $\rho(0) = 0$. It follows that the squared error
distortion measure satisfies the requirements (1)-(3) in Section 2.1. Therefore, it
can be easily shown that F is continuously differentiable and convex on
$K_1 \times K_2 \times \cdots \times K_P$, and that F is a strictly convex function
of $C_p$. Thus, Proposition 1 guarantees that when the squared error distortion is
used, the nonlinear Gauss-Seidel algorithm converges to a minimum.

A necessary condition for $R(C_1, C_2, \ldots, C_P)$ to be a contraction mapping is
that $x - c[\rho(x)]$ be a contraction mapping for any positive real number c. It is
clear that the function $\rho(x) = x^2$ does not satisfy such a requirement, and
Proposition 2 cannot be used to guarantee the convergence of the Jacobi algorithm
(when the squared error distortion measure is used). In fact, computer simulations
confirm the Jacobi algorithm is not guaranteed to converge, even when the initial
vector is close to $(C_1^*, C_2^*, \ldots, C_P^*)$.
2.2 The RVQ Design Algorithm 

The RVQ design algorithm, introduced in [5], attempts to jointly optimize all stage 
codebooks to minimize the overall reconstruction error of the RVQ subject to a con- 
straint on the number of direct-sum code vectors. It is an iterative procedure that is 
similar to the LBG algorithm. However, unlike the LBG algorithm, the optimization 
of the decoder (assuming the encoder is fixed) is an embedded iterative procedure 
that guarantees that new stage codebooks minimize the overall average distortion 
introduced by the RVQ. Therefore, there are two interlaced iterative procedures: one 
for optimization of the encoder/decoder pair, and another to simultaneously satisfy 
the stage-removed residual centroid condition in all stages. 

Assuming that all stage codebooks are held fixed, the first optimality condition
(given by (2)) implies that only exhaustive search encoders are guaranteed, in
general, to generate an optimal direct-sum Voronoi partition. However, exhaustive
search encoding is usually too expensive. An alternative (but generally sub-optimal)
encoder is the stage-sequential encoder. Although fast, this encoder is often unable
to find the best direct-sum code vector, thereby resulting in what may be a
significant increase in average distortion. Another sub-optimal, but efficient and
effective, encoder is the M-search encoder. The M-search technique, introduced in
[22] for tree searching, was shown to be very efficient when used to search the RVQ
tree [23, 15]. The M-search algorithm proceeds one level deeper into the RVQ tree by
extending all branches from the M surviving nodes, and only the best M of these
extended branches survive to the next level. This procedure continues until the last
stage of the codebook is reached, and then the code vector of the best path among the
final M paths is used. Employing M-search during the optimization of the encoder
usually leads to relatively low complexity with close-to-optimal performance [23, 15].
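
The M-search procedure can be sketched as follows. This is an illustration rather than the implementation of [22] or [23]: it assumes squared error and ranks partial paths by the energy of the residual left after the stages searched so far. The function name and the scalar codebooks are made up for the example.

```python
def m_search_encode(x1, stages, M):
    """Keep the M best partial paths at each stage; return (best index tuple, distortion)."""
    def sq_norm(v):
        return sum(c * c for c in v)

    # Each survivor is (indices chosen so far, residual after those stages).
    survivors = [((), tuple(x1))]
    for stage in stages:
        extended = []
        for idx, res in survivors:
            for jp, y in enumerate(stage):
                new_res = tuple(r - c for r, c in zip(res, y))
                extended.append((idx + (jp + 1,), new_res))  # 1-based stage index
        # Only the M extended branches with the smallest residual energy survive.
        extended.sort(key=lambda path: sq_norm(path[1]))
        survivors = extended[:M]
    best_idx, best_res = survivors[0]
    return best_idx, sq_norm(best_res)

# Assumed scalar (k = 1) codebooks for which the greedy first-stage choice is misleading.
stages = [[(3.0,), (-2.0,)], [(-2.9,), (1.0,)]]
greedy = m_search_encode((0.1,), stages, M=1)   # same as stage-sequential search
better = m_search_encode((0.1,), stages, M=2)   # also keeps the second-best first-stage branch
```

With M = 1 the search reduces to the stage-sequential encoder and returns path (2, 2); with M = 2 it recovers path (1, 1), whose direct sum 3.0 + (-2.9) matches the input.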
Assuming a fixed direct-sum partition $\mathcal{P}$, or equivalently, a fixed set of
stage partitions $\{\mathcal{P}_1, \mathcal{P}_2, \ldots, \mathcal{P}_P\}$, the
Gauss-Seidel algorithm is used to find the constituent codebooks
$\{C_1, C_2, \ldots, C_P\}$ with stage code vectors that simultaneously satisfy the
stage-removed residual centroid condition (8). It is shown above that, for the squared
error distortion measure, the Gauss-Seidel algorithm always converges to a minimum.
Therefore, the "decoder-only" iteration used to find a minimizing set of stage
codebooks can only reduce or leave unchanged the average distortion.

It is shown in [20] that if the encoder yields a Voronoi partition (in the squared 
error distortion sense) with respect to the direct-sum codebook and the Gauss-Seidel 
algorithm is used in the decoder optimization step, the fixed rate RVQ design algo- 
rithm converges monotonically to a fixed point which satisfies necessary conditions 
for minimum squared error distortion. However, it should be emphasized that if a 
sub-optimal encoder is used, then the encoder optimization step may actually in- 
crease the average distortion and monotonic convergence cannot be guaranteed. The 
possibility of a nonmonotonic average squared error distortion raises the issue of how 
to effectively terminate the iterative process. Fortunately, experimental results show
that the stage-sequential search RVQ design algorithm effectively reduces the average
distortion with only occasional deviations from monotonicity. Furthermore, the
M-search RVQ design algorithm converged monotonically in all our experiments to
a fixed point.

3 Variable Rate RVQ 

An optimal variable rate RVQ can be constructed by incorporating the entropy con- 
straint directly into the RVQ design loop. In [1], it is shown that the direct-sum 
codebook constraint can generally be expected to lead to both an increased average 
distortion and a decreased output entropy. This motivates an RVQ design algorithm 
which finds stage code vectors that minimize the average distortion subject to a con- 
straint on the output entropy of the RVQ. Necessary conditions for optimality of 
variable rate RVQ are derived in the next section, and an entropy-constrained RVQ 
design algorithm which satisfies these conditions is discussed in the following section. 

3.1 Necessary Conditions for Optimal Variable Rate RVQ 

For the direct-sum VQ, let $\mathcal{J}$ be the set of variable length indices
$\{c(j) : j \in J\}$. The direct-sum VQ mapping, $Q : \mathbb{R}^k \to C$, may be
realized by a composition of a variable length encoder mapping
$\mathcal{E} : \mathbb{R}^k \to \mathcal{J}$, where

$$\mathcal{E}(x_1) = c(j) \ \text{if and only if}\ x_1 \in V(j),$$

and a variable length decoder mapping $\mathcal{D} : \mathcal{J} \to C$, where

$$\mathcal{D}(c(j)) = y(j).$$

The variable length encoder can be further decomposed into two mappings,
$\mathcal{E} = L \circ E$, where $E : \mathbb{R}^k \to J$ and
$L : J \to \mathcal{J}$, and $\circ$ denotes composition. Similarly, one can
decompose the variable length decoder into two mappings,
$\mathcal{D} = D \circ L^{-1}$, where $L^{-1} : \mathcal{J} \to J$ and
$D : J \to C$. Note that the mapping L is one-to-one and onto, and hence is an
invertible mapping with inverse $L^{-1}$.

Let the distortion that results from representing $x_1$ with $\hat{x}_1$,
$d(x_1, \hat{x}_1)$, be a non-negative real-valued function that satisfies
requirements (1)-(3) of Section 2.1. According to distortion-rate theory [24], [25],
[26], the kth-order distortion function (where k is the vector size)

$$D_k(R) = \inf_{\mathrm{pr}\{\hat{x}_1 | x_1\}} \{ E[d(x_1, \hat{x}_1)] \mid I(X_1; \hat{X}_1) \le R \}$$

is a lower bound to the kth-order operational distortion-rate function

$$\hat{D}_k(R) = \inf_{(\mathcal{E}, \mathcal{D})} \{ E[d(x_1, \hat{x}_1)] \mid E[l(x_1)] \le R \},$$

where $l(x_1) = |\mathcal{E}(x_1)|$ is the length of the codeword representing $x_1$
and $I(X_1; \hat{X}_1)$ is the mutual information between $X_1$ and $\hat{X}_1$. The
convex hull of $\hat{D}_k(R)$ can be found [16] by minimizing the functional

$$J(\mathcal{E}, \mathcal{D}) = E[d(x_1, \hat{x}_1)] + \lambda E[l(x_1)],$$

where $\lambda$ can be interpreted as the slope of a line supporting the convex hull
of the operational distortion-rate function $\hat{D}_k(R)$.

A variable rate P-stage RVQ (with an average rate no greater than R) is said
to be optimal for $f_{X_1}(\cdot)$ if it gives at least a locally minimum value of
the average distortion. The design problem can be stated as follows: Choose the
codebook C, partition $\mathcal{P}$, and variable-length mapping L that minimize the
average distortion

$$D(x_1, \hat{x}_1) = E\{ d(x_1, Q(x_1)) \}$$

subject to

$$E\{ l(x_1) \} \le R,$$

where $l(x_1)$ is the variable length of the codeword representing $x_1$, and is
defined by

$$l(x_1) = |\mathcal{E}(x_1)| = |L(E(x_1))| = |L(j)|.$$

This constrained minimization problem can be replaced by the following unconstrained
minimization problem: Choose the codebook C, partition $\mathcal{P}$, and variable
length mapping L that minimize the Lagrangian

$$J_\lambda(E, L, D) = E\{ d(x_1, \hat{x}_1) + \lambda |L(j)| \}. \tag{11}$$
Proceeding, assume the codebooks $\{C_1, C_2, \ldots, C_P\}$ are fixed. This implies
the direct-sum codebook C is fixed. Also, assume the lengths |L(j)| of the channel
codewords associated with the direct-sum code vectors are fixed. Then, a partition
$\mathcal{P}$ that minimizes (11) is one that minimizes the integrand
$d(x_1, \hat{x}_1) + \lambda |L(j)|$ almost everywhere. That is,

$$x_1 \in V(j) \ \text{iff}\ d[x_1, y(j)] + \lambda |L(j)| \le d[x_1, y(k)] + \lambda |L(k)| \ \text{for all}\ k \in J. \tag{12}$$

Note that (2) is a special case of (12) when $\lambda = 0$.
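
The encoding rule (12) can be sketched as an exhaustive search over the direct-sum codebook. The code assumes squared error as the distortion; the function name, the dictionary representation of the codebook, and the numerical values are illustrative assumptions.

```python
def ec_encode(x1, codebook, lengths, lam):
    """Return the index j minimizing d[x1, y(j)] + lam * |L(j)| (squared error assumed)."""
    def cost(j):
        d = sum((a - b) ** 2 for a, b in zip(x1, codebook[j]))
        return d + lam * lengths[j]
    return min(codebook, key=cost)

# Assumed scalar direct-sum code vectors and codeword lengths (in bits).
codebook = {(1,): (0.0,), (2,): (1.0,)}
lengths = {(1,): 1.0, (2,): 3.0}

nearest = ec_encode((0.6,), codebook, lengths, lam=0.0)   # reduces to rule (2)
cheaper = ec_encode((0.6,), codebook, lengths, lam=0.2)   # rate term changes the choice
```

With $\lambda = 0$ the input 0.6 maps to the nearest code vector, index (2,); with $\lambda = 0.2$ the shorter codeword of index (1,) outweighs its larger distortion (0.36 + 0.2 versus 0.16 + 0.6).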

Next, assume the codebooks $\{C_1, C_2, \ldots, C_P\}$ and the partitions
$\{\mathcal{P}_1, \mathcal{P}_2, \ldots, \mathcal{P}_P\}$ are fixed. This implies
that both the direct-sum codebook C and the direct-sum partition $\mathcal{P}$ are
fixed. Then, note that (11) can be expressed as

$$J_\lambda(E, L, D) = \sum_{j \in J} E\{ d[x_1, y(j)] + \lambda |L(j)| \mid x_1 \in V(j) \}\, \mathrm{pr}(j), \tag{13}$$

where $\mathrm{pr}(j) = \mathrm{pr}\{x_1 \in V(j)\}$. A mapping L that minimizes (13)
is one that minimizes the expected codeword length

$$R = \sum_{j \in J} |L(j)|\, \mathrm{pr}(j).$$

Setting the codeword length |L(j)| to

$$|L^*(j)| = -\log_2 \mathrm{pr}(j) = -\log_2 \mathrm{pr}(j_1, j_2, \ldots, j_P) \tag{14}$$

results in an average rate which is equal to the output entropy of the direct-sum
quantizer.

The probability $\mathrm{pr}(j_1, j_2, \ldots, j_P)$ of a path in the RVQ can also be
written as the product of conditional probabilities, i.e.,

$$\mathrm{pr}(j_1, j_2, \ldots, j_P) = \mathrm{pr}(j_P | j_{P-1}, \ldots, j_1)\, \mathrm{pr}(j_{P-1} | j_{P-2}, \ldots, j_1) \cdots \mathrm{pr}(j_2 | j_1)\, \mathrm{pr}(j_1).$$

Therefore, we can write

$$|L^*(j)| = -\log_2 \mathrm{pr}(j_P | j_{P-1}, \ldots, j_1) - \log_2 \mathrm{pr}(j_{P-1} | j_{P-2}, \ldots, j_1) - \cdots - \log_2 \mathrm{pr}(j_2 | j_1) - \log_2 \mathrm{pr}(j_1) \tag{15}$$

and the output entropy of the optimal direct-sum RVQ can be written as

$$H(J_1, J_2, \ldots, J_P) = \sum_{p=1}^{P} H(J_p \mid J_{p-1}, \ldots, J_1).$$
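
The decomposition in (15) can be illustrated by estimating stage-conditional self-information from the index tuples produced by an encoder on a training set. The sketch below uses relative frequencies as probability estimates; the function names and the four example paths are assumptions made for illustration.

```python
from collections import Counter
from math import log2

def stage_conditional_lengths(paths):
    """Estimate -log2 pr(j_p | j_{p-1}, ..., j_1) for every observed path prefix."""
    prefix_counts = Counter()
    for j in paths:
        for p in range(len(j) + 1):
            prefix_counts[j[:p]] += 1      # the empty prefix counts all paths
    return {
        prefix: -log2(n / prefix_counts[prefix[:-1]])
        for prefix, n in prefix_counts.items() if prefix
    }

def path_length(j, cond_lengths):
    """|L*(j)| as the sum of P stage-conditional self-information terms, as in (15)."""
    return sum(cond_lengths[j[:p]] for p in range(1, len(j) + 1))

# Assumed encoder output for four training vectors of a two-stage RVQ.
paths = [(1, 1), (1, 1), (1, 2), (2, 1)]
cond = stage_conditional_lengths(paths)
len_11 = path_length((1, 1), cond)   # pr(1, 1) = 1/2
len_21 = path_length((2, 1), cond)   # pr(2, 1) = 1/4
```

Here the two stage terms for path (1, 1), $-\log_2(3/4)$ and $-\log_2(2/3)$, add up to $-\log_2(1/2) = 1$ bit, in agreement with (14).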

Finally, assume the stage partitions
$\{\mathcal{P}_1, \mathcal{P}_2, \ldots, \mathcal{P}_P\}$ are fixed. This implies the
direct-sum partition $\mathcal{P}$ is fixed. Also, assume that the lengths |L(j)| of
the channel codewords associated with the direct-sum code vectors are fixed. Then,
rewrite (11) as

$$J_\lambda(E, L, D) = \sum_{j \in J} E\{ d[x_1, D(j)] \mid x_1 \in V(j) \}\, \mathrm{pr}(j) + \lambda \sum_{j \in J} E\{ |L(j)| \mid x_1 \in V(j) \}\, \mathrm{pr}(j). \tag{16}$$

Clearly, a mapping D that minimizes (16) is one that minimizes

$$\sum_{j \in J} E\{ d[x_1, D(j)] \mid x_1 \in V(j) \}\, \mathrm{pr}(j).$$

To achieve this minimum, the multistage code vectors $y_p(j_p)$ at the pth stage must
satisfy (8), i.e.,

$$\int d[\gamma_p, y_p^*(j_p)]\, f_{\Gamma_p | j_p}(\gamma_p)\, d\gamma_p = \inf_{u \in \mathbb{R}^k} \int d(\gamma_p, u)\, f_{\Gamma_p | j_p}(\gamma_p)\, d\gamma_p, \tag{17}$$

where $\gamma_p = x_1 - g(\beta_p(j))$, and $f_{\Gamma_p | j_p}(\gamma_p)$ is defined
by (9).

3.2 The EC-RVQ Design Algorithm 

The EC-RVQ design algorithm proposed here is an iterative descent algorithm similar
to the one used for the design of EC-VQ codebooks. Each iteration consists of
applying the transformation

$$(E(t+1), L(t+1), D(t+1)) = T(E(t), L(t), D(t)),$$

where

$$E(t+1) = \arg\min_{E} J_\lambda(E, L(t), D(t)) \quad \text{(optimum partitions)}$$

$$L(t+1) = \arg\min_{L} J_\lambda(E(t+1), L, D(t)) \quad \text{(optimum codeword lengths)}$$

$$D(t+1) = \arg\min_{D} J_\lambda(E(t+1), L(t+1), D) \quad \text{(optimum code vectors)}$$

Following the lines of argument of [27], one can show that every limit point of the
sequence (E(t), L(t), D(t)), t = 0, 1, ..., generated by the transformation T
minimizes the Lagrangian $J_\lambda(E, L, D)$ (as given by (11)). Therefore, the
EC-RVQ design algorithm is guaranteed to converge to a local minimum.
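
The three-step transformation T can be sketched for a toy case. The code below is an illustration under several stated assumptions, not the authors' implementation: scalar data (k = 1), squared error, an exhaustive-search encoder, ideal codeword lengths estimated from training-set frequencies, a single Gauss-Seidel pass per iteration for the decoder step, and a fixed finite length `empty_len` for unused paths. All names and the synthetic Gaussian training set are hypothetical.

```python
import random
from collections import Counter
from itertools import product
from math import log2

def design_ec_rvq(train, stages, lam, iterations=10, empty_len=64.0):
    """Sketch of the EC-RVQ descent for scalar data; returns (stages, lengths, history)."""
    stages = [list(s) for s in stages]                  # work on a copy
    all_j = list(product(*[range(len(s)) for s in stages]))
    lengths = {j: log2(len(all_j)) for j in all_j}      # start from fixed-length codewords
    history = []
    for _ in range(iterations):
        # Step 1 (partitions): exhaustive search with the rule of (12).
        y = {j: sum(s[jp] for s, jp in zip(stages, j)) for j in all_j}
        assign = [min(all_j, key=lambda j: (x - y[j]) ** 2 + lam * lengths[j])
                  for x in train]
        # Step 2 (codeword lengths): ideal lengths -log2 pr(j), as in (14).
        counts = Counter(assign)
        lengths = {j: -log2(counts[j] / len(train)) if counts[j] else empty_len
                   for j in all_j}
        # Step 3 (code vectors): one Gauss-Seidel pass of residual centroid updates.
        for p, stage in enumerate(stages):
            sums, hits = [0.0] * len(stage), [0] * len(stage)
            for x, j in zip(train, assign):
                gamma = x - sum(stages[i][ji] for i, ji in enumerate(j) if i != p)
                sums[j[p]] += gamma
                hits[j[p]] += 1
            for jp in range(len(stage)):
                if hits[jp]:
                    stage[jp] = sums[jp] / hits[jp]
        # Lagrangian (11) evaluated on the training set.
        cost = sum((x - sum(s[jp] for s, jp in zip(stages, j))) ** 2 + lam * lengths[j]
                   for x, j in zip(train, assign)) / len(train)
        history.append(cost)
    return stages, lengths, history

rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(500)]       # synthetic memoryless Gaussian data
stages, lengths, history = design_ec_rvq(train, [[-1.0, 1.0], [-0.5, 0.5]], lam=0.1,
                                         iterations=8)
```

Each of the three steps cannot increase the training-set Lagrangian, so `history` should be non-increasing up to rounding. A full design would iterate the decoder step to convergence and replace the exhaustive search with a practical encoder.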

To find several points on the convex hull of the operational rate-distortion curve,
the minimization of $J_\lambda(E, L, D)$ is repeated for various $\lambda$'s. Starting
with $\lambda = 0$ (which corresponds to the RVQ codebook designed by the fixed rate
RVQ design algorithm), the EC-RVQ design algorithm uses a pre-determined sequence of
$\lambda$'s to design locally optimal variable rate EC-RVQ codebooks.

For optimal performance, the EC-RVQ design algorithm must generally employ 
an exhaustive-search encoder, a jointly optimized direct-sum decoder, and an optimal 
entropy coder as described by (14). Unfortunately, the computational complexity and 
memory requirements associated with optimal EC-RVQs are usually prohibitive, and 
sub-optimal design procedures are usually used to generate practical EC-RVQs. 

As with fixed rate RVQ design algorithms, the encoder does not necessarily have
to be optimal to be useful. Sub-optimal tree-structured searching techniques such
as stage-sequential searching or multipath searching can be employed, leading to
relatively fast encoder implementations. Experimental results indicate that
stage-sequential searching usually leads to a significant increase in average distortion, but
multipath M-searching can result in close-to-optimal performance, even with values
of M as small as 2 or 3 [23, 15].
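
The difference between stage-sequential and multipath searching can be sketched as follows. The function name `m_search` is hypothetical, and the sketch ranks paths by squared-error distortion only; an entropy-constrained encoder would rank them by the Lagrangian cost instead, which is omitted here to keep the example short.

```python
import numpy as np


def m_search(x, codebooks, m):
    """Encode x by keeping the m partial paths with the smallest residual energy per stage."""
    paths = [((), x)]  # each path: (stage indices chosen so far, current residual)
    for cb in codebooks:
        candidates = [(idx + (i,), r - c) for idx, r in paths for i, c in enumerate(cb)]
        candidates.sort(key=lambda t: float(t[1] @ t[1]))
        paths = candidates[:m]
    idx, r = paths[0]
    return idx, float(r @ r)
```

With `m = 1` this reduces to stage-sequential searching, and with `m` at least as large as the number of direct-sum code vectors it becomes an exhaustive search.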

Ideally, all stage codebooks in the RVQ should be jointly optimized. However, 
since the complexity of the joint optimization design process increases rapidly (quadrat- 
ically) with increasing number of stages, the RVQ design effort can become exces- 
sive. The complexity of the design can be greatly reduced by using conventional 
stage-sequential optimization, but the resulting performance can also be significantly 
reduced. The performance gap between sequential and joint optimization can be 
bridged by local joint optimization of the stage codebooks. The optimization is local 
in the sense that the stages are partitioned into overlapping blocks and the joint op- 
timization process is restricted to only the stages of each block. This technique was 
previously employed to accelerate the design of large-block fixed rate RVQ codebooks
with a relatively large number of stages [28]. However, we also note that, unlike fixed 
rate RVQ, EC-RVQ (with a modest number of stages) is shown experimentally to 
generally perform quite well when sequential stage-wise optimization is used. This 
encouraging result implies that, at moderate bit rates, the EC-RVQ design speed can 
be substantially increased without significantly impairing performance. 
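
The partitioning of stages into overlapping blocks can be pictured with a small helper. The name `overlapping_blocks` and the particular block size and overlap are assumptions for illustration; the text does not specify how the blocks were chosen.

```python
def overlapping_blocks(num_stages, block_size, overlap):
    """Split stage indices 0..num_stages-1 into consecutive blocks sharing `overlap` stages."""
    if not 0 <= overlap < block_size:
        raise ValueError("overlap must satisfy 0 <= overlap < block_size")
    step = block_size - overlap
    blocks, start = [], 0
    while True:
        blocks.append(list(range(start, min(start + block_size, num_stages))))
        if start + block_size >= num_stages:
            return blocks
        start += step
```

Joint optimization would then be applied to the stages of one block at a time, with the shared stages linking neighbouring blocks.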

A unique complexity reducing feature of EC-RVQ is its potential to use
stage-conditional (i.e., conditioned on previous stages) entropy tables of relatively small
sizes. Equation (15) shows that the optimal length (given by (14)) of the variable
length codeword associated with an index $j \in J$ is also the sum of $P$ stage-conditional
self-information components. During the design process, the lengths of the
stage-conditional entropy codewords can be estimated by using a sufficiently large training
set. Clearly, the aggregate number of tables of stage-conditional entropy codes can
become extremely large as the number of stages increases, which may offset the
memory savings obtained by using the RVQ structure. However, the number of tables can
be made relatively small by limiting the number $m$ of previous stages upon which
conditioning is based. In other words, the direct-sum codeword length $|L(j)|$ is
approximated by


$$|L(j)|_m = -\log_2 \mathrm{pr}(j_P \mid j_{P-1}, \ldots, j_{P-m}) - \log_2 \mathrm{pr}(j_{P-1} \mid j_{P-2}, \ldots, j_{P-m-1}) - \cdots - \log_2 \mathrm{pr}(j_2 \mid j_1) - \log_2 \mathrm{pr}(j_1). \qquad (18)$$

Obviously, since $H(J_p \mid J_{p-1}, \ldots, J_{p-m}) \ge H(J_p \mid J_{p-1}, \ldots, J_1)$ for each $p = 1, 2, \ldots, P$
and $m < p - 1$, it is easy to show that $H_m(J) = \sum_{p=1}^{P} H(J_p \mid J_{p-1}, \ldots, J_{p-m}) \ge H(J)$.
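
The truncated-context approximation can be sketched by estimating the stage-conditional probabilities from training indices. The function names below are hypothetical, and the probabilities are plain relative frequencies, so the lengths are only defined for index tuples that occur in the training set.

```python
from collections import Counter
import math


def fit_tables(index_tuples, m):
    """Count (context, index) pairs per stage; a context holds at most m previous indices."""
    num_stages = len(index_tuples[0])
    pair = [Counter() for _ in range(num_stages)]
    ctx = [Counter() for _ in range(num_stages)]
    for j in index_tuples:
        for p in range(num_stages):
            c = tuple(j[max(0, p - m):p])
            pair[p][(c, j[p])] += 1
            ctx[p][c] += 1
    return pair, ctx


def codeword_length(j, pair, ctx, m):
    """Approximate length |L(j)|_m: sum of stage-conditional self-informations in bits."""
    bits = 0.0
    for p in range(len(j)):
        c = tuple(j[max(0, p - m):p])
        bits -= math.log2(pair[p][(c, j[p])] / ctx[p][c])
    return bits


def average_length(index_tuples, m):
    """Empirical H_m(J): mean approximate length over the training indices."""
    pair, ctx = fit_tables(index_tuples, m)
    return sum(codeword_length(j, pair, ctx, m) for j in index_tuples) / len(index_tuples)


def num_tables(index_tuples, m):
    """Number of distinct contexts, i.e. conditional tables actually needed."""
    _, ctx = fit_tables(index_tuples, m)
    return sum(len(c) for c in ctx)
```

On any training set the average length does not increase as `m` grows, while the number of tables does not decrease, which is the memory/performance tradeoff described above.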

Experimental results show that the value of m that results in a good
complexity/performance tradeoff increases with both increasing number of stages and vector
size, but decreases with increasing stage codebook size. Recent results also show that 
the best value for m depends heavily on the source. For sources with memory, the 
best value of m is usually small (0 ≤ m ≤ 2). For memoryless sources, however, a
larger value of m is usually needed for a good tradeoff, which results in increased 
memory requirements. 

While the sub-optimal EC-RVQ design algorithms discussed above are not guaranteed
to converge to local minima, they provide good complexity/performance tradeoffs,
and they facilitate the design of practical EC-RVQs. We also point out that the
sub-optimal algorithms employed in all EC-RVQ experiments performed in this work
converged to a fixed point. Convergence was monotonic except for occasional
deviations, which were observed only when stage-sequential searching was used during
the encoding step of the EC-RVQ design.

4 Experimental Results


Many quantization techniques have been used to code Gaussian, Laplacian, and uni- 
form memoryless sources, as well as Gauss-Markov sources. Table 1 shows some of 
the well-known coders as compared qualitatively with EC-RVQ in terms of encoder 
complexity and memory. Among the VQ-based coders, EC-RVQ is less demanding
in terms of both memory and encoding complexity. Its encoding complexity and
memory requirements are comparable to those of EC-TCQ, but it does not suffer from
the relatively large coding delays associated with large trellises. Finally, it should 
be noted that when the dimension is one, EC-RVQ, or entropy-constrained residual 
scalar quantization (EC-RSQ), has the simplest encoding complexity and the smallest 
memory requirements. 

In this paper we report on the relative performance of these coding techniques 
for memoryless Gaussian, Laplacian, and uniform sources as well as a Gauss-Markov 
source. Experimental results demonstrate the performance of EC-RVQ and show 
its advantages and disadvantages when compared to some of the competitive cod- 
ing techniques that have appeared in the literature. In particular, EC-RVQ perfor- 
mance is compared to that of scalar quantization (SQ), entropy-constrained SQ (EC- 
SQ), entropy-constrained VQ (EC-VQ), trellis coded quantization (TCQ), entropy- 
constrained TCQ (EC-TCQ), and lattice-based VQs. For each of the sources con- 
sidered here, the EC-RVQs, the EC-RSQs, and the EC-SQs, which are described in 
Table 2, were designed on training sequences rather than on the underlying distri- 
butions, and were used to encode a test sequence of 40,000 samples taken from the 
same source. The performance results for EC-VQ [16], TCQ [29], EC-TCQ, predic- 
tive EC-SQ (PEC-SQ), predictive EC-TCQ (PEC-TCQ) [30], and lattice-based VQs 
[31, 16] are taken from the literature.

Experimental results for a Gaussian random variable with zero mean and unit
variance are shown in Figure 3 (top) and Table 3. Figure 3 (top) shows the rate- 
distortion performance for the various EC-RVQs and EC-SQ relative to the R(D) 
curve. Signal-to-noise ratio (SNR) values for EC-RVQ, EC-VQ, EC-SQ, D4 lattice, 
A2 lattice, TCQ, EC-TCQ, and R(D) at 0.5, 1.0, 1.5, and 2.0 bits per sample (bps) 
are given in Table 3. It can be seen that the performance of EC-RVQ increases with 
increased vector size, and that practical EC-RVQs outperform practical EC-VQs with 
the same vector size, even while maintaining relatively small encoding complexity and 
memory requirements. EC-RVQ is also competitive with both TCQ and EC-TCQ. 
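
The R(D) reference values for the memoryless Gaussian source follow from the standard squared-error distortion-rate function, D(R) = variance × 2^(−2R), which corresponds to an SNR of about 6.02 dB per bit. A minimal sketch (the function name is hypothetical):

```python
import math


def gaussian_rd_snr_db(rate_bits):
    """SNR in dB on the R(D) curve of a memoryless Gaussian source, D(R) = var * 2**(-2R)."""
    return 20.0 * rate_bits * math.log10(2.0)
```

Rounded to two decimals, this gives 3.01, 6.02, and 12.04 dB at 0.5, 1.0, and 2.0 bits per sample, matching the R(D) entries reported for those rates in Table 3.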

The next set of experiments considers the Laplacian source with zero mean and 
unit variance. Figure 3 (bottom) shows the rate-distortion performance of several 
EC-RVQs and EC-SQ relative to a curve linearly interpolated from well-known R(D) 
points. Numerical values are given in Table 4 for EC-RVQ, EC-RSQ, EC-SQ, TCQ, 
EC-TCQ, SQ, VQ, and R(D) at 0.5, 1.0, and 2.0 bps. Unlike the case of the Gaussian 
source, increasing the vector size does not improve the EC-RVQ rate-distortion per- 
formance significantly. This is explained by the fact that, as the vector size increases, 
encoding complexity and memory requirements limit the size of the initial codebook 
(or the peak bit rate) that can be used to design practical EC-RVQs. This leads to a 
reduction in rate-distortion performance because the Laplacian source (which has a 
peaked distribution) requires a very large output alphabet size (i.e., number of levels 
or code vectors), which is difficult to attain in practice. In fact, EC-RSQ is very 
competitive with EC-RVQ because the former has the potential to use an expanded 
set of direct-sum code vectors. When compared to other coding techniques, EC-RVQ 
(including the special case where the vector size k is equal to 1) outperforms the other
coders at low bit rates and is competitive with EC-TCQ at high rates.

Simulation results for the encoding of a memoryless uniform source are shown in
Figure 4 (top) with numerical values given in Table 5. As stated in [29], entropy 
coding does not lead to any performance gains in the case of scalar or trellis coded 
quantization. However, although the source is uniform, RVQ outputs are generally not 
equiprobable, and entropy coding usually leads to a slight performance gain. As can 
be seen, increasing the vector size leads to an increase in rate-distortion performance. 
However, EC-RVQ performance generally falls below that of TCQ [29], but becomes 
competitive when the vector size is relatively large (e.g., k = 16).

Finally, results for a Gauss-Markov source with correlation coefficient ρ = 0.9 are
shown in Figure 4 (bottom) and Table 6. Again, Figure 4 (bottom) shows the rate-distortion
performance of several EC-RVQs and EC-SQ relative to R(D) while Table 6 shows 
the SNRs for a number of predictive coding techniques as well as EC-RVQ and EC-VQ
at bit rates of 0.5, 1.0, 1.5, 2.0, and 2.5 bps. It should be noted that for rates
R < 0.926, the R(D) curve in Figure 4 (bottom) is actually an upper bound on the
true derived R(D) curve. As expected, there is a clear advantage of VQ-based coders
over most of the other non-predictive scalar coders. Although EC-VQ is expected 
to theoretically outperform all VQ-based coders for such a source, practical EC-VQs 
do not meet that expectation, mainly because the encoding complexity and memory 
requirements associated with such coders severely limit the initial codebook size (or 
the peak bit rate). In fact, EC-VQ is significantly outperformed by EC-RVQ with 
the same vector size. For vector sizes larger than 6, EC-RVQ outperforms PEC-SQ 
at all bit rates between 0.5 and 2.0 bits/sample, and is competitive with PEC-TCQ, 
especially at relatively large vector sizes (e.g., k = 16). It should be noted that the 
memory inherent in both the state and the predictor gives PEC-TCQ an effective 
vector size which is usually larger than the vector sizes used by the VQ-based coders. 
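
The rate 0.926 quoted above is log2(1 + ρ) for ρ = 0.9. For a unit-variance first-order Gauss-Markov source, the closed-form expression D(R) = (1 − ρ²) × 2^(−2R) coincides with the true distortion-rate function at or above that rate and is only a bound elsewhere. The sketch below evaluates both quantities; the function names are hypothetical.

```python
import math


def gauss_markov_bound_snr_db(rate_bits, rho):
    """SNR (dB) from D(R) = (1 - rho**2) * 2**(-2R) for a unit-variance Gauss-Markov source."""
    return 20.0 * rate_bits * math.log10(2.0) - 10.0 * math.log10(1.0 - rho**2)


def exactness_threshold_bits(rho):
    """Rate at or above which the closed-form expression equals the true R(D) curve."""
    return math.log2(1.0 + rho)
```

For ρ = 0.9 the expression gives 10.22 dB at 0.5 bits per sample and 13.23 dB at 1.0 bit per sample, matching the R(D) entries reported for those rates in Table 6.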

5 Summary


Necessary conditions for optimal variable rate RVQ have been derived, and an itera- 
tive descent algorithm for designing locally optimal variable rate EC-RVQ codebooks 
has been introduced. The RVQ structure is exploited to facilitate the implementation 
of practical EC-RVQs, which perform well even while maintaining very low encoding 
complexity and memory requirements. 

Experimental results for three memoryless sources and a Gauss-Markov source 
indicate that practical EC-RVQs have performance advantages over other VQ-based 
coders, including practical EC-VQs. Although EC-RVQ outperforms TCQ-based 
coders only at some relatively low bit rates for the Laplacian source, it is usually 
competitive and has the potential of increased rate-distortion performance when the 
peak bit rate is increased. Furthermore, encoding complexity and memory require- 
ments of EC-RVQ are comparable to those of TCQ-based coders, but EC-RVQ does 
not have the disadvantage of the long encoding delays associated with large trellises.


System            Block (vector) size   Encoding       Memory       Entropy coder
EC-SQ             1                     simple         very small   very simple
EC-RSQ            1                     very simple    very small   very simple
A2 Lattice        2                     moderate       small        simple
D4 Lattice        4                     moderate       small        simple
EC-VQ             4                     complex        large        complex
EC-VQ             8                     very complex   large        complex
EC-RVQ            4                     simple         small        simple
EC-RVQ            8                     moderate       small        simple
EC-RVQ            16                    complex        moderate     moderate
EC-TCQ (s=8)      1                     simple         small        simple
EC-TCQ (s=8)      4                     moderate       small        moderate
EC-TCQ (s=128)    1                     moderate       moderate     moderate

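Table 1: Qualitative comparison of several entropy-coded quantization systems


         EC-RVQ                                                                EC-RSQ        EC-SQ
         k=4           k=6           k=8           k=12          k=16
TSS      [illegible]   [illegible]   [illegible]   [illegible]   [illegible]   [illegible]   [illegible]
NS       [illegible]   5             5             [illegible]   8             3             1
SCS      16            16            16            [illegible]   [illegible]   4             16
PBR      4.00          3.33          2.50          [illegible]   [illegible]   [illegible]   [illegible]
NSP      2             2             2             3             3             1             1
MMO      1             2             2             2             2             1             [illegible]
NVDC     128           [illegible]   [illegible]   288           [illegible]   12            [illegible]
CM       [illegible]   [illegible]   2.56          4.61          8.19          0.48          0.64
TM       [illegible]   6.28          6.28          8.33          12.42         0.09          0.08
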
Table 2: Training set size (TSS) in thousands of vectors, number of stages (NS),
stage codebook size (SCS) in vectors, peak bit rate (PBR) in bits/sample, number
of search paths (NSP), Markov model order (MMO), number of vector distortion
calculations (NVDC) per input vector, codebook memory (CM) in kilobytes, and
maximum memory requirements for entropy tables (TM) in kilobytes for EC-RVQ,
EC-RSQ, and EC-SQ.


        EC-RVQ                      EC-VQ                                 TCQ     EC-TCQ
Rate    [illegible]   [illegible]   [illegible]   EC-SQ   D4     A2      s=256   s=128    R(D)
0.5     2.21          2.50          2.20          2.09    2.05   2.17    2.78    N/A      3.01
1.0     5.10          5.38          4.80          4.64    4.55   4.78    5.56    5.50     6.02
1.5     7.80          8.21          7.70          7.57    6.95   7.60    N/A     8.79     9.00
2.0     10.68         N/A           N/A           10.55   N/A    N/A     11.04   11.83    12.04


Table 3: Performance (SNR in dB) of various source coding schemes for the memo-
ryless Gaussian source at 0.5, 1.0, 1.5, and 2.0 bits per sample.


        EC-RVQ                                        TCQ     EC-TCQ                 VQ
Rate    [illegible]   [illegible]   EC-RSQ   EC-SQ   s=256   s=128    SQ            [illegible]   R(D)
0.5     3.23          3.27          3.15     3.14    2.20    N/A      N/A           1.97          N/A
1.0     5.91          5.92          5.90     5.79    5.54    4.82     [illegible]   4.96          6.62
2.0     11.38         11.58         11.50    11.31   11.22   12.35    [illegible]   N/A           12.66


Table 4: Performance (SNR in dB) of various source coding schemes for the memo-
ryless Laplacian source at 0.5, 1.0, and 2.0 bits per sample.


        EC-RVQ                        TCQ
Rate    [illegible]   k=16    EC-SQ   [illegible]   s=256   SQ      R(D)
0.5     3.12          3.20    3.08    2.84          3.24    N/A     N/A
1.0     6.27          6.39    6.04    6.22          6.58    6.02    6.79
2.0     12.27         12.79   12.08   12.62         13.00   12.04   13.21
3.0     18.58         N/A     18.10   18.83         19.23   18.06   19.42


Table 5: Performance (SNR in dB) of various source coding schemes for the memo-
ryless uniform source at 0.5, 1.0, 2.0, and 3.0 bits per sample.


        EC-RVQ                                    EC-VQ
Rate    [illegible]   [illegible]   [illegible]   [illegible]   [illegible]   PEC-SQ   PEC-TCQ   R(D)
0.5     7.45          8.43          9.32          7.10          8.15          N/A      N/A       10.22
1.0     10.64         11.58         12.36         10.40         11.15         N/A      N/A       13.23
1.5     [illegible]   14.29         15.29         12.15         N/A           13.86    15.30     16.26
2.0     16.15         17.23         N/A           15.80         N/A           17.22    18.38     [illegible]
2.5     19.14         20.13         N/A           N/A           N/A           20.48    21.41     [illegible]


Table 6: Performance (SNR in dB) of various source coding schemes for the Gauss-
Markov source at 0.5, 1.0, 1.5, 2.0 and 2.5 bits per sample.

Figure 1: A P-stage residual vector quantizer

Figure 2: A 3-level RVQ tree

[Axis label: Signal-to-noise ratio (dB)]

Figure 3: The R(D) performance of several EC-RVQs and EC-SQ relative to the true
R(D) curve for the Gaussian (Top) and the Laplacian (Bottom) memoryless sources.

[Axis label: Average bit rate (bits/sample)]

Figure 4: The R(D) performance of several EC-RVQs and EC-SQ relative to the true
R(D) curve for the uniform source (Top) and the Gauss-Markov source (Bottom).